Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images
In recent years, a large number of binarization methods have been developed,
with varying generalization performance and robustness across different
benchmarks. In this work, to leverage these methods, an ensemble of experts
(EoE) framework is introduced that efficiently combines the outputs of various
methods. The proposed framework offers a new process for selecting the
binarization methods, which act as the experts in the ensemble, by
introducing three concepts: confidentness, endorsement, and schools of experts.
The framework, which is highly objective, is built on two general
principles: (i) consolidation of saturated opinions and (ii) identification of
schools of experts. After building the endorsement graph of the ensemble for an
input document image based on the confidentness of the experts, the saturated
opinions are consolidated, and then the schools of experts are identified by
thresholding the consolidated endorsement graph. A variation of the framework
in which no selection is made is also introduced; it combines the outputs of
all experts using endorsement-dependent weights. The EoE framework is evaluated
on the set of participating methods in the H-DIBCO'12 contest and also on an
ensemble generated from various instances of the grid-based Sauvola method,
with promising performance. Comment: 6-page version, accepted to be presented at ICDAR'1
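The selection process described above can be illustrated with a toy sketch: pairwise agreement between binarized outputs stands in for endorsement (the paper's confidentness-based construction is richer), the endorsement graph is thresholded, and the largest connected component (a "school" of experts) votes per pixel. All function names and the threshold value are illustrative assumptions, not the paper's implementation:

```python
def endorsement(a, b):
    """Fraction of pixels on which two binary (0/1) expert outputs agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def endorsement_graph(outputs):
    """Pairwise endorsement matrix over a list of flattened binary images."""
    n = len(outputs)
    return [[endorsement(outputs[i], outputs[j]) if i != j else 0.0
             for j in range(n)] for i in range(n)]

def schools_of_experts(g, threshold=0.9):
    """Connected components of the thresholded endorsement graph:
    experts in the same school endorse each other strongly."""
    n = len(g)
    seen, schools = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, school = [start], set()
        while stack:
            u = stack.pop()
            if u in school:
                continue
            school.add(u)
            stack.extend(v for v in range(n)
                         if g[u][v] >= threshold and v not in school)
        seen |= school
        schools.append(sorted(school))
    return schools

def combine(outputs, threshold=0.9):
    """Per-pixel majority vote over the largest school of experts."""
    g = endorsement_graph(outputs)
    school = max(schools_of_experts(g, threshold), key=len)
    votes = [sum(outputs[i][p] for i in school)
             for p in range(len(outputs[0]))]
    return [1 if 2 * v >= len(school) else 0 for v in votes]
```

For example, with three roughly agreeing experts and one outlier, the outlier ends up in its own school and is excluded from the vote.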
Challenges and complexities in application of LCA approaches in the case of ICT for a sustainable future
In this work, three of many ICT-specific challenges of LCA are discussed.
First, the inconsistency versus uncertainty is reviewed with regard to the
meta-technological nature of ICT. As an example, the semiconductor technologies
are used to highlight the complexities especially with respect to energy and
water consumption. The need for specific representations and metrics to
separately assess products and technologies is discussed. It is highlighted
that applying product-oriented approaches would result in abandoning or
disfavoring of new technologies that could otherwise help toward a better
world. Second, several hot spots commonly believed to be untouchable are
highlighted to emphasize their importance and footprint. The list includes, but
is not limited to, i) User-Computer Interfaces (UCIs), especially screens and
displays, ii) Network-Computer Interfaces (NCIs), such as electronic and
optical ports, and iii) electricity power interfaces. In addition, considering
cross-regional social and economic impacts, and also taking into account the
marketing-driven nature of the need for many ICT products and services, in both
hardware and software forms, the complexity of the End-of-Life (EoL) stage of
ICT products, technologies, and services is explored. Finally, the impact of smart management
and intelligence, and in general software, in ICT solutions and products is
highlighted. In particular, it is observed that, even using the same
technology, the significance of software could be highly variable depending on
the level of intelligence and awareness deployed. With examples from an
interconnected network of data centers managed using Dynamic Voltage and
Frequency Scaling (DVFS) technology and smart cooling systems, it is shown that
the unadjusted assessments could be highly uncertain, and even inconsistent, in
quantifying the management component's contribution to the ICT impacts. Comment: 10 pages. Preprint/accepted version of a paper submitted to the ICT4S
Conference
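The DVFS example above rests on the standard CMOS dynamic-power relation, P ≈ C·V²·f: because DVFS lowers voltage along with frequency, software-level management can change power roughly cubically, which is why the management component's share of the footprint is so variable. A minimal sketch, with purely illustrative constants:

```python
def cpu_dynamic_power(c_eff, voltage, freq_hz):
    """Classic CMOS dynamic-power model: P = C_eff * V^2 * f.
    All parameter values used below are illustrative, not measured."""
    return c_eff * voltage ** 2 * freq_hz

# DVFS scales voltage down together with frequency, so halving the clock
# cuts dynamic power by much more than half (hypothetical numbers):
p_full = cpu_dynamic_power(1e-9, 1.2, 3.0e9)  # full speed
p_half = cpu_dynamic_power(1e-9, 0.8, 1.5e9)  # scaled down
```

Under these assumed values, the scaled-down setting draws well under half the full-speed dynamic power, despite delivering half the clock rate.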
Carbon-profit-aware job scheduling and load balancing in geographically distributed cloud for HPC and web applications
This thesis introduces two carbon-profit-aware control mechanisms that can be used to improve the performance of job scheduling and load balancing in an interconnected system of geographically distributed data centers for HPC and web applications. These control mechanisms consist of three primary components that perform: 1) measurement and modeling, 2) job planning, and 3) plan execution. The measurement and modeling component provides information on energy consumption and carbon footprint, as well as utilization, weather, and pricing information. The job planning component uses this information to suggest the best arrangement of applications as a possible configuration to the plan execution component, which carries it out on the system.
For reporting and decision-making purposes, some metrics need to be modeled based on directly measured inputs. There are two challenges in accurately modeling these necessary metrics: 1) feature selection and 2) curve fitting (regression). First, to improve the accuracy of power consumption models of underutilized servers, advanced fitting methodologies were applied to the selected server features. The resulting model is then evaluated on real servers and used as part of the load balancing mechanism for web applications. We also provide an inclusive model of the cooling system in data centers to optimize its power consumption, which in turn is used by the planning component. Furthermore, we introduce another model to calculate the profit of the system based on the price of electricity, carbon tax, operational costs, sales tax, and corporate taxes. This model is used for optimized scheduling of HPC jobs.
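One concrete reading of that profit model combines the listed terms as after-tax revenue minus energy, carbon-tax, and operational costs; the thesis's exact formula may differ, and all parameter names here are assumptions:

```python
def hourly_profit(revenue, energy_kwh, electricity_price, carbon_kg,
                  carbon_tax_per_kg, operational_cost,
                  sales_tax_rate, corporate_tax_rate):
    """Illustrative data-center profit model in the spirit of the thesis:
    profit = revenue net of sales tax, minus electricity cost, carbon tax,
    and operational costs, with corporate tax applied to positive profit."""
    net_revenue = revenue * (1.0 - sales_tax_rate)
    costs = (energy_kwh * electricity_price
             + carbon_kg * carbon_tax_per_kg
             + operational_cost)
    pre_tax = net_revenue - costs
    # Corporate tax only applies when the pre-tax profit is positive.
    return pre_tax * (1.0 - corporate_tax_rate) if pre_tax > 0 else pre_tax
```

A scheduler built on such a model can compare candidate job placements by their marginal effect on `hourly_profit` across data centers with different electricity prices and carbon intensities.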
For position allocation of web applications, a new heuristic algorithm is introduced for load balancing of virtual machines in a geographically distributed system in order to improve its carbon awareness. This heuristic is based on a genetic algorithm and is specifically tailored to optimization problems of interconnected systems of distributed data centers. A simple version of this heuristic algorithm has been implemented in the GSN project as a carbon-aware controller.
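A stripped-down sketch of the genetic-algorithm idea (not the thesis's tailored algorithm): each individual assigns every VM to a data center, fitness is the total carbon intensity of the assignment, and the population evolves by elitist selection, one-point crossover, and random-reassignment mutation. The capacity-free carbon model, parameter values, and names are all assumptions:

```python
import random

def evolve_placement(n_vms, carbon_intensity, generations=200,
                     pop_size=30, mutation_rate=0.1, seed=0):
    """Toy carbon-aware VM placement via a genetic algorithm.
    `carbon_intensity[d]` is the assumed per-VM carbon cost of data center d;
    capacity limits and migration costs are deliberately omitted."""
    rng = random.Random(seed)
    n_dcs = len(carbon_intensity)

    def fitness(ind):  # lower is better: summed carbon of the assignment
        return sum(carbon_intensity[dc] for dc in ind)

    # Random initial population of VM-to-data-center assignments.
    pop = [[rng.randrange(n_dcs) for _ in range(n_vms)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[:pop_size // 2]      # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_vms)
            child = a[:cut] + b[cut:]        # one-point crossover
            if rng.random() < mutation_rate:  # mutate one random gene
                child[rng.randrange(n_vms)] = rng.randrange(n_dcs)
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)
```

With no capacity constraints, the population drifts toward placing every VM in the lowest-carbon data center; the interest of the tailored version lies in the constraints and migration costs this sketch leaves out.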
Similarly, for scheduling of HPC jobs on servers, two new metrics are introduced: 1) profit-per-core-hour-GHz and 2) virtual carbon tax. In the HPC job scheduler, these new metrics are used to maximize the profit and minimize the carbon footprint of the system, respectively. Once the application execution plan is determined, the plan execution component attempts to implement it on the system. The plan execution component directly uses the hypervisors on physical servers to create, remove, and migrate virtual machines. It also executes and controls the HPC jobs or web applications on the virtual machines.
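One plausible interpretation of the profit-per-core-hour-GHz metric normalizes a job's profit by the compute capacity it consumed (cores × hours × clock GHz), so jobs of different sizes become comparable; the thesis's exact definition may differ:

```python
def profit_per_core_hour_ghz(profit, cores, hours, freq_ghz):
    """Profit normalized by consumed compute capacity (cores * hours * GHz).
    An illustrative reading of the metric named in the text, not the
    thesis's authoritative definition."""
    return profit / (cores * hours * freq_ghz)

# A job earning $12 on 4 cores for 2 hours at 3 GHz:
# profit_per_core_hour_ghz(12.0, 4, 2.0, 3.0) == 0.5
```

A scheduler maximizing this quantity prefers jobs that earn the most per unit of capacity, independent of how many cores or how much time each job requests.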
For validating systems designed using the proposed modeling and planning components, a simulation platform using real system data was developed, and the new methodologies were compared with state-of-the-art methods under various scenarios. The experimental results show improvement in the power modeling of servers, significant carbon reduction in load balancing of web applications, and significant profit-carbon improvement in HPC job scheduling.